security analysis
Whose Narrative is it Anyway? A KV Cache Manipulation Attack
Ganesh, Mukkesh, Iyer, Kaushik, Ananthan, Arun Baalaaji Sankar
The Key-Value (KV) cache is an important component for efficient inference in autoregressive Large Language Models (LLMs), but its role as a representation of the model's internal state makes it a potential target for integrity attacks. This paper introduces "History Swapping," a novel block-level attack that manipulates the KV cache to steer model generation without altering the user-facing prompt. The attack involves overwriting a contiguous segment of the active generation's cache with a precomputed cache from a different topic. We empirically evaluate this method across 324 configurations on the Qwen 3 family of models, analyzing the impact of timing, magnitude, and layer depth of the cache overwrite. Our findings reveal that only full-layer overwrites can successfully hijack the conversation's topic, leading to three distinct behaviors: immediate and persistent topic shift, partial recovery, or a delayed hijack. Furthermore, we observe that high-level structural plans are encoded early in the generation process and local discourse structure is maintained by the final layers of the model. This work demonstrates that the KV cache is a significant vector for security analysis, as it encodes not just context but also topic trajectory and structural planning, making it a powerful interface for manipulating model behavior.
- North America > United States (0.05)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
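The block-level overwrite described in the "History Swapping" abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the cache layout (a list of per-layer dicts holding per-position key and value vectors), the function name `history_swap`, and the `layers` parameter are all assumptions made for the example.

```python
from copy import deepcopy


def history_swap(active_cache, donor_cache, start, length, layers=None):
    """Return a copy of `active_cache` in which positions [start, start+length)
    are replaced by the same positions of `donor_cache`.

    Hypothetical layout: cache[layer] = {"k": [vec, ...], "v": [vec, ...]},
    with one vector per token position.
    """
    if layers is None:
        layers = range(len(active_cache))  # assumption: overwrite every layer
    end = start + length
    swapped = deepcopy(active_cache)  # leave the caller's cache untouched
    for i in layers:
        for name in ("k", "v"):
            if end > len(donor_cache[i][name]) or end > len(swapped[i][name]):
                raise ValueError("swap segment exceeds cache length")
            swapped[i][name][start:end] = deepcopy(donor_cache[i][name][start:end])
    return swapped


def toy_cache(num_layers, seq_len, fill):
    """Build a toy cache whose vectors record their origin for inspection."""
    return [
        {
            "k": [[fill, layer, pos] for pos in range(seq_len)],
            "v": [[fill, layer, pos] for pos in range(seq_len)],
        }
        for layer in range(num_layers)
    ]


active = toy_cache(num_layers=4, seq_len=10, fill="A")  # ongoing generation
donor = toy_cache(num_layers=4, seq_len=10, fill="B")   # precomputed, other topic

hijacked = history_swap(active, donor, start=2, length=5)             # all layers
partial = history_swap(active, donor, start=2, length=5, layers=[3])  # last layer only
```

Here `start` stands in for the timing of the overwrite, `length` for its magnitude, and `layers` for the layer depth, the three factors the abstract says were varied. In a real model the vectors would be tensors and the swapped cache would be handed back to the decoding loop.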
Towards LLM-based Root Cause Analysis of Hardware Design Failures
Qiu, Siyu, Wang, Muzhi, Afsharmazayejani, Raheel, Shahmiri, Mohammad Moradi, Tan, Benjamin, Pearce, Hammond
With advances in large language models (LLMs), new opportunities have emerged to develop tools that support the digital hardware design process. In this work, we explore how LLMs can assist with explaining the root cause of design issues and bugs that are revealed during synthesis and simulation, a necessary milestone on the pathway towards widespread use of LLMs in the hardware design process and for hardware security analysis. We find promising results: for our corpus of 34 different buggy scenarios, OpenAI's o3-mini reasoning model reached a correct determination 100% of the time under pass@5 scoring, with other state-of-the-art models and configurations usually achieving more than 80% performance and more than 90% when assisted with retrieval-augmented generation. Encountering bugs, glitches, and faults is a normal part of the digital hardware design lifecycle. Ensuring that they are completely removed and repaired is a time-consuming process requiring a deep understanding of both the technical cause of the issue and any impacts on the broader hardware system, particularly as any missed repair may have severe downstream functional and/or security consequences [1] (if the bug is of an exploitable nature). However, as digital hardware grows in complexity, so do the frequency and nature of the bugs themselves.
- North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.14)
- Asia > India (0.04)
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
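The "pass@5 scoring" mentioned in the hardware root-cause abstract above can be illustrated with a short sketch. It assumes the simplest reading of pass@k: a scenario counts as solved if at least one of its first k attempts is correct. The paper may use a different estimator, and the scenario names and outcomes below are invented for the example.

```python
def pass_at_k(results, k):
    """Fraction of scenarios with at least one correct answer in the first k attempts.

    `results` maps a scenario name to a list of booleans, one per attempt.
    """
    if not results:
        raise ValueError("no scenarios to score")
    solved = sum(1 for attempts in results.values() if any(attempts[:k]))
    return solved / len(results)


# Hypothetical outcomes: four buggy scenarios, five attempts each.
attempts = {
    "bug_01": [False, True, False, False, False],
    "bug_02": [True, True, True, True, True],
    "bug_03": [False, False, False, False, False],
    "bug_04": [False, False, False, False, True],
}

score_at_1 = pass_at_k(attempts, 1)
score_at_5 = pass_at_k(attempts, 5)
```

Allowing five attempts instead of one raises the score in this toy data because `bug_01` and `bug_04` are only solved on a later attempt.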
Threat Modeling for AI: The Case for an Asset-Centric Approach
Vicarte, Jose Sanchez, Spoczynski, Marcin, Elsaid, Mostafa
Recent advances in AI are transforming AI's ubiquitous presence in our world from that of standalone AI-applications into deeply integrated AI-agents. These changes have been driven by agents' increasing capability to autonomously make decisions and initiate actions using existing applications, whether or not those applications are AI-based. This evolution enables unprecedented levels of AI integration, with agents now able to take actions on behalf of systems and users -- including, in some cases, the powerful ability for the AI to write and execute scripts as it deems necessary. With AI systems now able to autonomously execute code, interact with external systems, and operate without human oversight, traditional security approaches fall short. This paper introduces an asset-centric methodology for threat modeling AI systems that addresses the unique security challenges posed by integrated AI agents. Unlike existing top-down frameworks that analyze individual attacks within specific product contexts, our bottom-up approach enables defenders to systematically identify how vulnerabilities -- both conventional and AI-specific -- impact critical AI assets across distributed infrastructures used to develop and deploy these agents. This methodology allows security teams to: (1) perform comprehensive analysis that communicates effectively across technical domains, (2) quantify security assumptions about third-party AI components without requiring visibility into their implementation, and (3) holistically identify AI-based vulnerabilities relevant to their specific product context. This approach is particularly relevant for securing agentic systems with complex autonomous capabilities. By focusing on assets rather than attacks, our approach scales with the rapidly evolving threat landscape while accommodating increasingly complex and distributed AI development pipelines.
- North America > United States (0.14)
- North America > Mexico > Gulf of Mexico (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.86)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.66)
Review for NeurIPS paper: Security Analysis of Safe and Seldonian Reinforcement Learning Algorithms
Weaknesses: W1: The study seems to focus too much on algorithms that are based on safety tests. I understand that the analysis is not compatible, but maybe it would be worth including studies on how easy it is to trick those algorithms too. More generally (even for IS algorithms), it was a bit odd to me that the study does not consider attacks on the way pi_e is chosen. W2: It's unclear to me whether the trajectory must still have been performed in the real environment, or whether it can be completely made up (but then its value has to be within the range [0,1]). Also, with model-based methods (for both environment and policy models), it might be possible to single out the few trajectories that are inconsistent with the other trajectories.
Review for NeurIPS paper: Security Analysis of Safe and Seldonian Reinforcement Learning Algorithms
All the reviewers support acceptance for the contributions, notably improvements to the robustness of RL algorithms to adversarial attacks, and a clear exposition on how these methods can be applied to real world problems. Please consider revising the paper to address the concerns raised in the reviews and rebuttal, in particular to better explain the scope of the work. Separately, it may be useful to extend the broader impact statement to inform a casual reader that a mathematical safety guarantee on an algorithm is not a replacement for domain specific safety requirements (for example, the diabetes treatment would still need oversight for medical safety).
Security Analysis of Safe and Seldonian Reinforcement Learning Algorithms
We analyze the extent to which existing methods rely on accurate training data for a specific class of reinforcement learning (RL) algorithms, known as Safe and Seldonian RL. We introduce a new measure of security to quantify the susceptibility to perturbations in training data by creating an attacker model that represents a worst-case analysis, and show that a couple of Seldonian RL methods are extremely sensitive to even a few data corruptions. We then introduce a new algorithm that is more robust against data corruptions, and demonstrate its usage in practice on some RL problems, including a grid-world and a diabetes treatment simulation.
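To make the idea of a safety test that is sensitive to corrupted training data concrete, the sketch below uses a Hoeffding lower bound on the mean of importance-weighted returns. This is an assumed, simplified stand-in: the abstract does not say which concentration inequality or attacker model the paper uses, and the numbers (200 clean returns of 0.2, an upper bound of 5.0, a threshold of 0.3, 30 injected trajectories) are chosen only for illustration.

```python
import math


def hoeffding_lower_bound(values, upper, delta):
    """High-confidence lower bound on the mean of values lying in [0, upper]."""
    n = len(values)
    mean = sum(values) / n
    return mean - upper * math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def safety_test(values, upper, threshold, delta=0.05):
    """Pass only if the lower bound on expected return meets the threshold."""
    return hoeffding_lower_bound(values, upper, delta) >= threshold


upper = 5.0      # assumed bound on an importance-weighted return
threshold = 0.3  # assumed performance the new policy must exceed

clean = [0.2] * 200               # hypothetical weighted returns from real data
corrupted = clean + [upper] * 30  # attacker appends trajectories at the maximum value

clean_passes = safety_test(clean, upper, threshold)
corrupted_passes = safety_test(corrupted, upper, threshold)
```

With these numbers the clean data fails the test while the corrupted data passes it, which is the kind of failure a worst-case attacker model is meant to quantify.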
ROS2-Based Simulation Framework for Cyberphysical Security Analysis of UAVs
Patil, Unmesh, Gunasekaran, Akshith, Bobba, Rakesh, Abbas, Houssam
We present a new simulator of Uncrewed Aerial Vehicles (UAVs) that is tailored to the needs of testing cyber-physical security attacks and defenses. Recent investigations into UAV safety have unveiled various attack surfaces and some defense mechanisms. However, due to escalating regulations imposed by aviation authorities on security research on real UAVs, and the substantial costs associated with hardware test-bed configurations, there arises a necessity for a simulator capable of substituting for hardware experiments, and/or narrowing down their scope to the strictly necessary. The study of different attack mechanisms requires specific features in a simulator. We propose a simulation framework based on ROS2, leveraging some of its key advantages, including modularity, replicability, customization, and the utilization of open-source tools such as Gazebo. Our framework has a built-in motion planner, controller, communication models and attack models. We share examples of research use cases that our framework can enable, demonstrating its utility.
- North America > United States > Oregon (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Asia (0.04)
- Information Technology > Security & Privacy (1.00)
- Government (1.00)
- Transportation (0.86)
Demo: SGCode: A Flexible Prompt-Optimizing System for Secure Generation of Code
Ton, Khiem, Nguyen, Nhi, Nazzal, Mahmoud, Khreishah, Abdallah, Borcea, Cristian, Phan, NhatHai, Jin, Ruoming, Khalil, Issa, Shen, Yelong
This paper introduces SGCode, a flexible prompt-optimizing system to generate secure code with large language models (LLMs). SGCode integrates recent prompt-optimization approaches with LLMs in a unified system accessible through front-end and back-end APIs, enabling users to 1) generate secure code, which is free of vulnerabilities, 2) review and share security analysis, and 3) easily switch from one prompt optimization approach to another, while providing insights on model and system performance. We populated SGCode on an AWS server with PromSec, an approach that optimizes prompts by combining an LLM and security tools with a lightweight generative adversarial graph neural network to detect and fix security vulnerabilities in the generated code. Extensive experiments show that SGCode is practical as a public tool to gain insights into the trade-offs between model utility, secure code generation, and system cost. SGCode has only a marginal cost compared with prompting LLMs. SGCode is available at: http://3.131.141.63:8501/.
- North America > United States > Utah > Salt Lake County > Salt Lake City (0.05)
- North America > United States > New Jersey > Essex County > Newark (0.05)
- North America > United States > Washington > King County > Redmond (0.04)
A Multi-Agent Security Testbed for the Analysis of Attacks and Defenses in Collaborative Sensor Fusion
Hallyburton, R. Spencer, Hunt, David, Luo, Shaocheng, Pajic, Miroslav
The performance and safety of autonomous vehicles (AVs) deteriorate under adverse environments and adversarial actors. The investment in multi-sensor, multi-agent (MSMA) AVs is meant to promote improved efficiency of travel and mitigate safety risks. Unfortunately, minimal investment has been made to develop security-aware MSMA sensor fusion pipelines, leaving them vulnerable to adversaries. To advance security analysis of AVs, we develop the Multi-Agent Security Testbed, MAST, in the Robot Operating System (ROS2). Our framework is scalable for general AV scenarios and is integrated with recent multi-agent datasets. We construct the first bridge between AVstack and ROS and develop automated AV pipeline builds to enable rapid AV prototyping. We tackle the challenge of deploying variable numbers of agent/adversary nodes at launch-time with dynamic topic remapping. Using this testbed, we motivate the need for security-aware AV architectures by exposing the vulnerability of centralized multi-agent fusion pipelines to (un)coordinated adversary models in case studies and Monte Carlo analysis.
- North America > United States > Michigan (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Transportation > Passenger (1.00)
- Transportation > Ground > Road (1.00)
- Information Technology > Security & Privacy (1.00)
FINER: Enhancing State-of-the-art Classifiers with Feature Attribution to Facilitate Security Analysis
He, Yiling, Lou, Jian, Qin, Zhan, Ren, Kui
Deep learning classifiers achieve state-of-the-art performance in various risk detection applications. They explore rich semantic representations and are supposed to automatically discover risk behaviors. However, due to the lack of transparency, the behavioral semantics cannot be conveyed to downstream security experts to reduce their heavy workload in security analysis. Although feature attribution (FA) methods can be used to explain deep learning, the underlying classifier is still blind to what behavior is suspicious, and the generated explanation cannot adapt to downstream tasks, incurring poor explanation fidelity and intelligibility. In this paper, we propose FINER, the first framework for risk detection classifiers to generate high-fidelity and high-intelligibility explanations. The high-level idea is to gather explanation efforts from the model developer, the FA designer, and security experts. To improve fidelity, we fine-tune the classifier with an explanation-guided multi-task learning strategy. To improve intelligibility, we engage task knowledge to adjust and ensemble FA methods. Extensive evaluations show that FINER improves explanation quality for risk detection. Moreover, we demonstrate that FINER outperforms a state-of-the-art tool in facilitating malware analysis.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Europe > Portugal > Braga > Braga (0.04)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)